Voice conversion using General Regression Neural Network

Authors

  • Jagannath H. Nirmal
  • Mukesh A. Zaveri
  • Suprava Patnaik
  • Pramod H. Kachare
Abstract

The objective of a voice conversion system is to formulate a mapping function that transforms the source speaker's characteristics into those of the target speaker. In this paper, we propose a General Regression Neural Network (GRNN) based model for voice conversion. GRNN is a single-pass learning network, which makes the training procedure comparatively fast. The proposed system uses the shape of the vocal tract, the shape of the glottal pulse (excitation signal), and long-term prosodic features to carry out the voice conversion task. The shape of the vocal tract and the shape of the source excitation of a particular speaker are represented using Line Spectral Frequencies (LSFs) and the Linear Prediction (LP) residual, respectively. GRNN is used to obtain the mapping function between the source and target speakers. Direct transformation of the time-domain residual using an Artificial Neural Network (ANN) causes phase changes and generates artifacts in consecutive frames. To alleviate this, wavelet packet decomposed coefficients are used to characterize the excitation of the speech signal. The long-term prosodic parameters, namely the pitch contour (intonation) and the energy profile of the test signal, are also modified in relation to those of the target (desired) speaker using the baseline method. The performance of the proposed model is compared with voice conversion systems based on state-of-the-art Radial Basis Function (RBF) and Gaussian Mixture Model (GMM) approaches, using objective and subjective evaluation measures. The evaluation measures show that the proposed GRNN-based voice conversion system performs slightly better than the state-of-the-art models. © 2014 Elsevier B.V. All rights reserved.
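To illustrate why a GRNN needs only a single pass over the training data, here is a minimal sketch of the standard GRNN formulation: training stores the paired source and target feature vectors, and prediction returns a Gaussian-kernel weighted average of the stored targets. This is not the authors' implementation. The function name `grnn_predict`, the smoothing parameter `sigma`, and the toy one-dimensional vectors are assumptions made for illustration; in the paper the mapped features are LSFs and wavelet packet coefficients.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Map one source feature vector to a target feature vector.

    train_x: list of source feature vectors (the stored patterns).
    train_y: list of target feature vectors paired with train_x.
    query:   source feature vector to convert.
    sigma:   kernel width (hypothetical default, must be tuned).
    """
    # Pattern layer: one Gaussian kernel per stored training pair.
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    weights = [math.exp(-d / (2.0 * sigma ** 2)) for d in dists]
    total = sum(weights)

    if total == 0.0:
        # All kernels underflowed (query far from every pattern):
        # fall back to the target of the nearest stored pattern.
        nearest = min(range(len(dists)), key=dists.__getitem__)
        return list(train_y[nearest])

    # Summation and output layers: normalized weighted average.
    dim = len(train_y[0])
    return [
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(dim)
    ]


# Toy paired data: two source frames and their target frames.
source_frames = [[0.0], [1.0]]
target_frames = [[0.0], [2.0]]
converted = grnn_predict(source_frames, target_frames, [0.5])
```

Because "training" is just storing the pairs, there is no iterative weight update; the only parameter to choose is `sigma`, which controls how sharply the prediction follows the nearest stored patterns.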


Similar resources

Application of Linear Regression and Artificial Neural Network for Broiler Chicken Growth Performance Prediction

This study was conducted to investigate the prediction of growth performance using linear regression and artificial neural network (ANN) in broiler chicken. Artificial neural networks (ANNs) are powerful tools for modeling systems in a wide range of applications. The ANN model with a back propagation algorithm successfully learned the relationship between the inputs of metabolizable energy (kca...


Real-time voice conversion using artificial neural networks with rectified linear units

This paper presents an approach to parametric voice conversion that can be used in real-time entertainment applications. The approach is based on spectral mapping using an artificial neural network (ANN) with rectified linear units (ReLU). To overcome the oversmoothing problem a special network configuration is proposed that utilizes temporal states of the speaker. The speech is represented usi...


Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform

An artificial neural network is one of the most important models for training features of voice conversion (VC) tasks. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as mel cepstral coefficients (MCC) which represent the spectrum features. However, a simple representation for fundamental frequency (F0) is not enough for neural networks to deal with an...


Tone Quality Improvement of Bone Conduction Voice by Cepstrum-based Local Conversion Models

A novel tone quality improvement method for a bone conduction voice is presented. In the present method, the tone quality of the bone conduction voice is converted to the similar quality of the air conduction voice. For the voice conversion, the present method uses a codebook, which consists of various paired code vectors of the bone and air conduction voices. The delta and mel-cepstral coeffici...


Artificial Neural Network Based Pathological Voice Classification Using MFCC Features

The analysis of pathological voice is a challenging and an important area of research in speech processing. Acoustic voice analysis can be used to characterize the pathological voices with the aid of the speech signals recorded from the patients. This paper presents a method for the identification and classification of pathological voice using Artificial Neural Network. Multilayer Perceptron Ne...



Journal title:
  • Appl. Soft Comput.

Volume 24  Issue

Pages  -

Publication date 2014